Reducing one-to-many problem in Voice Conversion by equalizing the formant locations using dynamic frequency warping

Author

  • Seyed Hamidreza Mohammadi
Abstract

In this study, we investigate a way to reduce the effect of the one-to-many problem in voice conversion (VC). The one-to-many problem occurs when two very similar speech segments from the source speaker correspond to target-speaker segments that are not similar to each other. As a result, the mapping function usually oversmooths the generated features so that they resemble both target segments. We propose equalizing the formant locations of source-target frame pairs using dynamic frequency warping in order to reduce the complexity of the mapping. After conversion, a second dynamic frequency warping is applied to reverse the effect of the formant location equalization performed during training. Subjective experiments showed that the proposed approach improves speech quality significantly.
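The abstract does not say how the frequency warping function is computed, so the following is only a minimal sketch of one common way to do dynamic frequency warping: a DTW-style alignment along the frequency axis of two log-magnitude spectra. The function names (`dfw_path`, `warp_function`, `apply_warp`), the squared-error local cost, and the synthetic single-peak "spectra" are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def dfw_path(src, tgt):
    """Align frequency bins of two spectra by dynamic programming.

    Returns a monotonic list of (source_bin, target_bin) pairs.
    Assumption: squared difference of spectral values as local cost.
    """
    n, m = len(src), len(tgt)
    cost = (src[:, None] - tgt[None, :]) ** 2
    acc = np.full((n, m), np.inf)
    acc[0, 0] = cost[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best = np.inf
            if i > 0:
                best = min(best, acc[i - 1, j])
            if j > 0:
                best = min(best, acc[i, j - 1])
            if i > 0 and j > 0:
                best = min(best, acc[i - 1, j - 1])
            acc[i, j] = cost[i, j] + best
    # Backtrack from the last bin pair to the first.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1, j], i - 1, j))
        if j > 0:
            candidates.append((acc[i, j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return path

def warp_function(path, size, axis):
    """For each bin on `axis` (0 = source, 1 = target), average the bins
    matched to it on the other axis. Gives a fractional warping curve."""
    sums = np.zeros(size)
    counts = np.zeros(size)
    for pair in path:
        sums[pair[axis]] += pair[1 - axis]
        counts[pair[axis]] += 1
    return sums / counts

def apply_warp(spec, warp):
    """Output bin k takes the (linearly interpolated) value at bin warp[k]."""
    return np.interp(warp, np.arange(len(spec)), spec)

# Toy example: one "formant" peak at bin 20 (source) and bin 30 (target).
bins = np.arange(64, dtype=float)
src = np.exp(-((bins - 20.0) ** 2) / 18.0)
tgt = np.exp(-((bins - 30.0) ** 2) / 18.0)

path = dfw_path(src, tgt)
warp = warp_function(path, len(tgt), axis=1)   # target bin -> source bin
warped = apply_warp(src, warp)                 # source peak moved to the target location

inv = warp_function(path, len(src), axis=0)    # source bin -> target bin
restored = apply_warp(warped, inv)             # reverse warping after "conversion"

print(int(np.argmax(warped)), int(np.argmax(restored)))
```

In this toy setting the forward warp plays the role of formant location equalization, and the second warp stands in for the reversing step the abstract describes after conversion. A real system would operate on spectral envelopes frame by frame and would likely constrain or smooth the warping curve.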

Similar resources

不需平行語料而基於共振峰與線頻譜頻率映對之語者特質轉換系統 (A Voice Conversion System based on Formant and LSF Mapping without Using Parallel Corpus) [In Chinese]

Voice conversion has been used in many applications. Methods based on vector quantization codebooks and Gaussian mixture models need dynamic time warping on a parallel sentence corpus to generate mapping functions. Recent studies try to use less training data, or even no parallel sentence corpus. This paper presents a voice conversion method that does not use a parallel sentence corpus. It ...

Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems

This article examines methods for improving the performance of GMM-based voice conversion systems, taking the GMM method as the baseline. In current methods, because a single conversion function is used to convert all speech units and statistical averaging then smooths the spectrum, we will observe quality ...

Probability models of formant parameters for voice conversion

This paper explores the estimation and mapping of probability models of formant parameter vectors for voice conversion. The formant parameter vectors consist of the frequency, bandwidth and intensity of resonance at formants. Formant parameters are derived from the coefficients of a linear prediction (LP) model of speech. The formant distributions are modelled with phoneme-dependent two-dimensio...

Robot Arm Performing Writing through Speech Recognition Using Dynamic Time Warping Algorithm

This paper aims to develop a writing robot that recognizes the user's speech signal. The robot arm is built mainly for disabled people who cannot write on their own. Here, the dynamic time warping (DTW) algorithm is used to recognize the speech signal from the user. The action performed by the robot arm in the environment is done by reducing the redundancy which frequently fac...

ICA 2010 paper

Automatic speech recognition (ASR) in noisy environments is one of the challenging topics in the field. In particular, speech observed in a noisy environment is heavily distorted compared with neutral speech observed in a quiet one. This distortion is called the Lombard effect, and it degrades ASR performance. It should occur strongly when the speaker has no auditory feedback. In conven...

Journal title:
  • CoRR

Volume abs/1510.04205

Publication date 2015